feat: enforce limit-ratio quality gate and bump 0.8.0 #4

Merged
SummerOneTwo merged 2 commits into master from feat/limit-ratio-verification-0.8.0
Apr 28, 2026

@SummerOneTwo
Owner

Summary

  • enforce final-test sampling to prioritize limit-oriented coverage: by default, at least half of generated tests are type=3/4 (extreme/tle) when candidates are sufficient
  • add manifest-backed problem_verify_tests limit_ratio quality gate (default enabled, explicit opt-out via enable_limit_ratio=false), and add tests for pass/fail/default/opt-out behavior
  • sync workflow/docs and bump project/plugin/package versions to 0.8.0
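The sampling rule in the first bullet can be sketched roughly as follows. This is a minimal illustration, not the code in src/autocode_mcp/tools/problem.py: the function name `sample_final_tests`, the `Candidate` shape, and the "take candidates in their existing order" policy are assumptions made for the example. Only the type=3/4 (extreme/tle) encoding and the "at least half when candidates are sufficient" rule come from the PR description.

```python
import math
from typing import TypedDict


class Candidate(TypedDict):
    # Hypothetical record shape; the real tool may store more fields.
    name: str
    type: int  # per the PR: 3 = extreme, 4 = tle; other values are ordinary cases


LIMIT_TYPES = {3, 4}


def sample_final_tests(candidates: list[Candidate], total: int) -> list[Candidate]:
    """Pick up to `total` tests, reserving at least half the slots for limit cases.

    If there are too few limit candidates, the quota is simply not reached
    ("when candidates are sufficient"). If there are too few ordinary
    candidates, the leftover slots are backfilled with extra limit cases.
    """
    limit = [c for c in candidates if c["type"] in LIMIT_TYPES]
    other = [c for c in candidates if c["type"] not in LIMIT_TYPES]

    quota = math.ceil(total / 2)
    n_limit = min(len(limit), quota)
    n_other = min(len(other), total - n_limit)
    # Backfill: any slot the ordinary cases could not fill goes to limit cases.
    n_limit = min(len(limit), total - n_other)

    return limit[:n_limit] + other[:n_other]


def limit_ratio(tests: list[Candidate]) -> float:
    """Fraction of tests that are type=3/4; 0.0 for an empty set."""
    if not tests:
        return 0.0
    return sum(1 for t in tests if t["type"] in LIMIT_TYPES) / len(tests)
```

With 10 limit candidates and 10 ordinary ones, asking for 10 tests yields 5 of each. With only 2 limit candidates the result still has 10 tests, but the ratio drops to 0.2, which is the situation the verification gate in the second bullet is meant to flag.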

Test plan

  • uv run pytest tests/ -q
  • uv run ruff check .
  • uv run mypy src/
  • claude plugin validate .

Guarantee final generated tests prioritize limit-oriented coverage by requiring at least half type=3/4 cases by default, and verify this via manifest-backed quality checks with an explicit opt-out. Also synchronize workflow docs and plugin/package versions for the 0.8.0 release line.

Made-with: Cursor
Copilot AI review requested due to automatic review settings April 28, 2026 05:23

Copilot AI left a comment

Pull request overview

This PR introduces a “limit-oriented coverage” quality gate by enforcing that the final generated test set contains at least 50% extreme/TLE cases (type=3/4) when possible, adds a manifest-backed verification check for the ratio (enabled by default with explicit opt-out), and bumps project/plugin/package versions to 0.8.0.

Changes:

  • Update problem_generate_tests sampling to prioritize type=3/4 for at least half of final tests and emit a .autocode_tests_manifest.json manifest plus ratio stats.
  • Add problem_verify_tests limit_ratio check (default enabled; opt-out via enable_limit_ratio=false) with unit tests for pass/fail/default/opt-out behavior.
  • Sync documentation/workflow guidance and bump version strings to 0.8.0.
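To make the manifest-backed check concrete, here is a hedged sketch of what a limit_ratio gate could look like. The manifest file name and the `enable_limit_ratio` flag are taken from this PR; the manifest schema (a top-level "tests" list whose entries carry a "type" field), the function name `check_limit_ratio`, the `min_ratio` parameter, and the shape of the returned dict are all assumptions for illustration and may differ from src/autocode_mcp/tools/test_verify.py.

```python
import json
from pathlib import Path

MANIFEST_NAME = ".autocode_tests_manifest.json"  # file name as stated in the PR
LIMIT_TYPES = (3, 4)  # extreme / tle


def check_limit_ratio(
    tests_dir: Path,
    enable_limit_ratio: bool = True,  # default enabled, explicit opt-out
    min_ratio: float = 0.5,  # hypothetical parameter; the PR fixes the threshold at 50%
) -> dict:
    """Return a small result dict describing the limit_ratio check outcome."""
    if not enable_limit_ratio:
        return {"check": "limit_ratio", "status": "skipped"}

    manifest_path = tests_dir / MANIFEST_NAME
    if not manifest_path.is_file():
        # Without a manifest the ratio cannot be verified, so treat it as a failure.
        return {"check": "limit_ratio", "status": "fail", "reason": "manifest missing"}

    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    tests = manifest.get("tests", [])  # assumed schema: [{"type": 3, ...}, ...]
    limit_count = sum(1 for t in tests if t.get("type") in LIMIT_TYPES)
    ratio = limit_count / len(tests) if tests else 0.0

    status = "pass" if ratio >= min_ratio else "fail"
    return {"check": "limit_ratio", "status": status, "ratio": ratio}
```

Reading the ratio from the manifest rather than re-deriving it from the test files keeps verification independent of how the generator classified each case, at the cost of trusting the manifest to be in sync with the files on disk.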

Reviewed changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| uv.lock | Bumps editable package version to 0.8.0. |
| pyproject.toml | Updates project version to 0.8.0. |
| src/autocode_mcp/__init__.py | Updates __version__ to 0.8.0. |
| .claude-plugin/plugin.json | Bumps plugin manifest version to 0.8.0. |
| tests/test_packaging.py | Updates version assertion to 0.8.0. |
| tests/test_plugin_manifest.py | Updates plugin manifest version assertion to 0.8.0. |
| src/autocode_mcp/tools/problem.py | Enforces limit-case quota in sampling, writes test manifest, and returns new limit-ratio stats. |
| src/autocode_mcp/tools/test_verify.py | Adds manifest-backed limit_ratio verification with default enable + explicit opt-out. |
| tests/test_tools/test_problem.py | Adds tests for limit_ratio verification and the new sampling quota behavior. |
| src/autocode_mcp/prompts/__init__.py | Updates prompts to reflect the new sampling policy. |
| README.md | Documents the new generation policy and verification gate. |
| skills/autocode-workflow/SKILL.md | Updates workflow/quality gate documentation for the 50% extreme/TLE threshold. |
| agents/autocode-workflow.md | Updates agent instructions to enforce the quality requirement during test generation. |
| scripts/workflow_guard.py | Updates workflow guard messaging to reflect the new preference/requirement. |
| CLAUDE.md | Syncs workflow step documentation with the new quality threshold. |
| CHANGELOG.md | Adds 0.8.0 release notes describing the new gate and behavior. |

Comment thread src/autocode_mcp/tools/problem.py Outdated
Comment thread src/autocode_mcp/tools/problem.py Outdated
Address Copilot review by matching schema wording with actual deterministic ordering and preventing unconditional signature-based de-duplication during final sampling, so enable_dedup=false semantics remain effective.

Made-with: Cursor
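
The de-duplication fix described in this commit amounts to gating the signature filter on the flag instead of always applying it. The sketch below shows that idea only; `content_signature` and `filter_for_sampling` are hypothetical names, and hashing the raw input text with SHA-256 is an assumption about what "signature" means here, not the project's actual scheme.

```python
import hashlib


def content_signature(text: str) -> str:
    """Hypothetical signature: SHA-256 of the raw test input."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def filter_for_sampling(inputs: list[str], enable_dedup: bool = True) -> list[str]:
    """Drop inputs with a repeated signature, but only when enable_dedup is set.

    Order is preserved so that downstream sampling stays deterministic.
    """
    if not enable_dedup:
        # Respect the caller's opt-out: pass every candidate through unchanged.
        return list(inputs)

    seen: set[str] = set()
    unique: list[str] = []
    for text in inputs:
        sig = content_signature(text)
        if sig in seen:
            continue
        seen.add(sig)
        unique.append(text)
    return unique
```

For example, `filter_for_sampling(["1 2\n", "1 2\n", "3\n"])` keeps two inputs, while the same call with `enable_dedup=False` keeps all three.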
@SummerOneTwo SummerOneTwo merged commit 08268de into master Apr 28, 2026
6 checks passed
@SummerOneTwo SummerOneTwo deleted the feat/limit-ratio-verification-0.8.0 branch April 28, 2026 05:31

2 participants